This static report is accompanied by a live R Shiny dashboard allowing one to test different tickers, exclude specific outlier periods, and adjust bootstrap simulation parameters.
Healthcare expenditure represents a substantial share of economic activity in many high-income countries. In the United States, national health spending is projected to reach 20.3% of Gross Domestic Product (GDP) by 2033, up from 17.6% in 2023 (Keehan et al. 2025). Accurately modelling healthcare utilisation and expenditure is essential for forecasting budgetary pressures, planning service capacity, and designing policies that can meet the needs of an ageing population in an efficient and sustainable way.
The Agency for Healthcare Research and Quality (AHRQ) publishes the Medical Expenditure Panel Survey (MEPS), which provides detailed information on healthcare utilisation, associated expenditures, insurance coverage, and socio-demographic characteristics, all standardised to represent a full calendar year for each respondent (Agency for Healthcare Research and Quality 2023). These data enable one to quantify how healthcare use and costs vary across population groups and to identify factors associated with higher or lower spending, which is valuable for targeting interventions and evaluating potential policy reforms.
From a modelling perspective, healthcare expenditure data pose several
distributional challenges: they are non-negative, highly right-skewed (a few
patients incur massive costs), and contain a large proportion of individuals
with no recorded healthcare usage. Understanding both whether individuals access
healthcare and, conditional on doing so, how much is spent is crucial for
characterising demand and the resulting financial burden. This report uses 2012
MEPS data to investigate two primary outcomes related to physician services: the
number of doctor consultations (dvisit), capturing
healthcare utilisation, and the annual expenditures on doctor visits
(dvexpend), capturing the associated financial cost.
The 2012 MEPS dataset contains 10,638 observations on US adults aged 18–65. Tables 2.1 & 2.2 show the categorical and quantitative variables for both the full sample and a subsample of individuals with positive expenditure.
The covariates include demographic characteristics (age, gender, ethnicity,
region, and education), socioeconomic status (income), and health indicators
(BMI, self-reported general and mental health, hypertension, and
hyperlipidemia).
The number of non-physician visits (ndvisit) was
retained as a proxy for the latent propensity to seek care.
| Variable | Category | Full sample (proportion) | Positive expenditure (proportion) |
|---|---|---|---|
| gender | Female | 0.53 | 0.62 |
| | Male | 0.47 | 0.38 |
| ethnicity | White | 0.70 | 0.71 |
| | Black | 0.21 | 0.20 |
| | Native American | 0.01 | 0.01 |
| | Others | 0.09 | 0.08 |
| education_cat_detailed | Less than High School | 0.22 | 0.18 |
| | High School Graduate | 0.31 | 0.29 |
| | Some College | 0.24 | 0.25 |
| | College Graduate | 0.15 | 0.17 |
| | Post-Graduate | 0.09 | 0.11 |
| region | Northeast | 0.15 | 0.16 |
| | Midwest | 0.19 | 0.21 |
| | South | 0.39 | 0.38 |
| | West | 0.26 | 0.24 |
| hypertension | No | 0.75 | 0.65 |
| | Yes | 0.25 | 0.35 |
| hyperlipidemia | No | 0.77 | 0.68 |
| | Yes | 0.23 | 0.32 |
| Variable | Description | Full sample, mean (SD) | Positive expenditure, mean (SD) |
|---|---|---|---|
| bmi | Body mass index | 28.03 (6.41) | 28.68 (6.79) |
| age | Age (years) | 40.25 (13.66) | 43.32 (13.55) |
| education | Education (no. of years) | 12.77 (2.90) | 13.13 (2.81) |
| income | Income (USD) | 60,817 (51,451) | 65,688 (54,115) |
| dvisit | Doctor visits | 2.13 (3.63) | 3.94 (4.14) |
| ndvisit | Non-doctor visits | 0.94 (2.91) | 1.51 (3.62) |
| dvexpend | Doctor expenditure | 481.07 (1,646.79) | 889.71 (2,156.92) |
| ndvexpend | Non-doctor expenditure | 159.59 (790.31) | 266.51 (1,038.92) |
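Doctor-visit expenditure (dvexpend) is semi-continuous,
characterised by a point mass at zero (45.9% of observations)
and a heavy right skew among positive values. To address this, a two-part
model was used: a Probit model for the probability of incurring any
expenditure (Part 1), and a Gamma GLM with a log link for the level of
expenditure among individuals with positive expenditure (Part 2).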
For both parts, the choice of covariates was informed by a preliminary Lasso
regression using the full set of candidate predictors, combined with prior
domain knowledge. ndvisit was retained a priori in both
parts as a proxy for latent propensity to use healthcare services. The final
two-part model is shown in Equations (2.1) & (2.2).
\[\begin{align}
% Part 1: Probit
\text{Part 1: }& \nonumber\\
\mathbb{P}\big[\text{dvexpend}_i > 0\big] &= \Phi(\eta_{1i}) \nonumber \\
\eta_{1i} &= \gamma_0 + \gamma_1 \text{Age}_i + \gamma_2 \text{Gender}_i + \gamma_3 \text{BMI}_i \nonumber \\
&\quad + \gamma_4 \text{Ethnicity}_i + \gamma_5 \text{Region}_i + \gamma_6 \text{Education}_i \nonumber \\
&\quad + \gamma_7 \text{General}_i + \gamma_8 \text{Mental}_i \nonumber \\
&\quad + \gamma_9 \text{Hypertension}_i + \gamma_{10} \text{Hyperlipidemia}_i \nonumber \\
&\quad + \gamma_{11} \text{Income}_i + \gamma_{12} \text{ndvisit}_i \tag{2.1} \\[10pt]
% Part 2: Gamma
\text{Part 2: }& \nonumber\\
\mathbb{E}\big[\text{dvexpend}_i &\mid \text{dvexpend}_i > 0\big] = \mu_i \nonumber \\
\ln(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i \nonumber \\
&\quad + \beta_3 \text{Region}_i + \beta_4 \text{Education}_i \nonumber \\
&\quad + \beta_5 \text{General}_i + \beta_6 \text{Mental}_i \nonumber \\
&\quad + \beta_7 \text{Hypertension}_i + \beta_8 \text{Hyperlipidemia}_i \nonumber \\
&\quad + \beta_9 \text{Income}_i + \beta_{10} \text{ndvisit}_i \tag{2.2}
\end{align}\]
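Here \(\Phi\) is the standard normal cumulative distribution function,
\(\eta_{1i}\) is the linear predictor for the Probit model, \(\gamma_j\) are
the Probit coefficients, \(\mu_i\) is the conditional mean expenditure, and
\(\beta_j\) are the coefficients for the Gamma GLM with a log link.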
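The two-part structure in Equations (2.1) and (2.2) can be sketched in base R as follows. This is a minimal illustration on simulated data, not the report's fitted model: the variable names (`any_exp`, `cost`, `male`), the two covariates, and all simulation parameters are assumptions chosen for the example.

```r
# Minimal two-part model sketch on simulated data (not MEPS).
set.seed(1)
n <- 2000
age <- runif(n, 18, 65)
male <- rbinom(n, 1, 0.47)

# Part 1 data-generating process: probit for any positive expenditure
p_pos <- pnorm(-0.8 + 0.02 * age - 0.5 * male)
any_exp <- rbinom(n, 1, p_pos)

# Part 2 data-generating process: Gamma with a log-linear mean for spenders
mu <- exp(5.5 + 0.01 * age - 0.2 * male)
shape <- 0.8
cost <- ifelse(any_exp == 1, rgamma(n, shape = shape, rate = shape / mu), 0)

dat <- data.frame(cost, any_exp, age, male)

# Part 1: Probit model on the full sample
fit_probit <- glm(any_exp ~ age + male,
                  family = binomial(link = "probit"), data = dat)

# Part 2: Gamma GLM with log link, fitted to positive spenders only
fit_gamma <- glm(cost ~ age + male,
                 family = Gamma(link = "log"), data = subset(dat, cost > 0))

# Unconditional expected cost = P(cost > 0) * E[cost | cost > 0]
dat$pred <- predict(fit_probit, newdata = dat, type = "response") *
  predict(fit_gamma, newdata = dat, type = "response")

c(observed_mean = mean(dat$cost), predicted_mean = mean(dat$pred))
```

Multiplying the two predictions gives the unconditional expected expenditure for each individual, which is the quantity a combined prediction from the two parts would be compared against observed costs.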
The count of doctor visits (dvisit) exhibited substantial
overdispersion (variance 13.1 versus mean 2.1), violating the
equidispersion assumption of a standard Poisson model. A Negative Binomial
regression was therefore fitted to account for the excess variability. As for
the expenditure model, Lasso screening and prior domain knowledge were used to
inform covariate choice (Equation (2.3)).
A Poisson regression model with the same covariates as the final specification was also fitted. A formal overdispersion test, comparing the Poisson residual deviance to a \(\chi^2\) distribution with the corresponding residual degrees of freedom, yielded a Pearson dispersion of 4.18 (p < 0.001). The Poisson model also had a higher Akaike Information Criterion (AIC) than the Negative Binomial model (AIC\(_\text{Poisson}\) = 50,386 versus AIC\(_\text{NegBin}\) = 37,334), further supporting the choice of a Negative Binomial model.
\[\begin{align}
% Count model: Negative Binomial
\text{Count model: } \text{dvisit}_i &\sim \text{NegBin}(\mu_i, \kappa) \nonumber \\[4pt]
\log(\mu_i) &= \beta_0 + \beta_1 \text{Age}_i + \beta_2 \text{Gender}_i + \beta_3 \text{BMI}_i \nonumber \\
&\quad + \beta_4 \text{General}_i + \beta_5 \text{Mental}_i \nonumber \\
&\quad + \beta_6 \text{Ethnicity}_i + \beta_7 \text{Region}_i \nonumber \\
&\quad + \beta_8 \text{Hypertension}_i + \beta_9 \text{Hyperlipidemia}_i \nonumber \\
&\quad + \beta_{10} \text{Income}_i + \beta_{11} \text{ndvisit}_i \nonumber \\
&\quad + \beta_{12} \text{Education}_i \tag{2.3}
\end{align}\]
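A Pearson dispersion statistic of the kind quoted above can be computed from a fitted Poisson GLM in base R. The sketch below uses simulated overdispersed counts (not MEPS); the covariate `x` and all simulation parameters are illustrative assumptions.

```r
set.seed(2)
n <- 3000
x <- rnorm(n)

# Simulated overdispersed counts: Negative Binomial with mean mu and size 0.7
mu <- exp(0.5 + 0.3 * x)
y <- rnbinom(n, size = 0.7, mu = mu)

# Poisson GLM fitted to the overdispersed counts
fit_pois <- glm(y ~ x, family = poisson(link = "log"))

# Pearson dispersion: sum of squared Pearson residuals over residual df
pearson_chi2 <- sum(residuals(fit_pois, type = "pearson")^2)
dispersion <- pearson_chi2 / df.residual(fit_pois)

# Upper-tail p-value against a chi-squared with the residual df
p_value <- pchisq(pearson_chi2, df = df.residual(fit_pois), lower.tail = FALSE)

c(dispersion = dispersion, p_value = p_value)
```

A dispersion well above 1 indicates that the variance exceeds the Poisson mean, which is the situation a Negative Binomial model is designed to handle.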
To investigate the dependence structure between the frequency of visits and the
intensity of expenditure beyond observed covariates, a copula approach was
employed following the framework of Marra and Radice (2025b). Using Sklar’s Theorem,
the joint cumulative distribution function, \(H(y_1,y_2)\), of utilisation
(\(Y_1\)) and expenditure (\(Y_2\)) can be modelled by coupling their marginal
distributions (\(F_1\) and \(F_2\)) via a copula function \(C\):
\[
H(y_1,y_2) = C(F_1(y_1), F_2(y_2); \theta)
\]
where \(\theta\) is the association parameter. A Gaussian copula was fitted to
the residuals of the marginal models for individuals with positive expenditure
using the GJRM package in R (Marra and Radice 2025a), and Kendall’s \(\tau\),
a measure of rank correlation, was estimated.
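To make the coupling in Sklar's Theorem concrete, the following base R sketch simulates from a Gaussian copula and attaches a count margin and a continuous margin. It does not use GJRM and is not the report's estimation procedure; the value of `theta` and the marginal parameters are hypothetical.

```r
set.seed(3)
n <- 1000
theta <- 0.7  # illustrative association parameter, not the fitted value

# Step 1: correlated standard normals with correlation theta
z1 <- rnorm(n)
z2 <- theta * z1 + sqrt(1 - theta^2) * rnorm(n)

# Step 2: transform to uniforms; (u1, u2) is a draw from the Gaussian copula
u1 <- pnorm(z1)
u2 <- pnorm(z2)

# Step 3: apply inverse marginal CDFs (hypothetical parameters)
visits <- qnbinom(u1, size = 1.5, mu = 4)            # count margin
cost   <- qgamma(u2, shape = 0.8, rate = 0.8 / 900)  # continuous margin

# Empirical Kendall's tau on the copula scale versus the Gaussian-copula value
tau_hat <- cor(u1, u2, method = "kendall")
tau_theory <- 2 / pi * asin(theta)
c(tau_hat = tau_hat, tau_theory = tau_theory)
```

The margins can be changed without altering the dependence on the copula scale, which is the separation of marginal behaviour from association that the copula approach relies on.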
Both the Gamma and Negative Binomial models were fitted with a log link, so their coefficients are interpreted multiplicatively. For a predictor \(x_k\) with coefficient \(\beta_k\), a \(\Delta\)-unit increase in \(x_k\) multiplies the mean (conditional mean cost in the Gamma model and expected visit count in the Negative Binomial model) by
\[
\exp(\Delta \beta_k),
\]
and the percent change is given by
\[
100\left(\exp(\Delta \beta_k)-1\right)\%.
\]
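The percent-change formula can be applied directly to reported coefficients. The helper name `pct_change` below is hypothetical; the coefficient values are the Gamma estimates for age, male gender, and poor general health reported in Table 3.1.

```r
# Percent change in the mean implied by a log-link coefficient
pct_change <- function(beta, delta = 1) {
  100 * (exp(delta * beta) - 1)
}

# Gamma GLM coefficients as reported in Table 3.1
round(pct_change(0.009, delta = 10), 1)  # age, 10-year increase: 9.4
round(pct_change(-0.195), 1)             # male vs female: -17.7
round(pct_change(1.090), 1)              # poor vs excellent general health: 197.4
```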
| Term | Probit: Estimate | Probit: Std. Error | Probit: p-value | Gamma: Estimate | Gamma: Std. Error | Gamma: p-value |
|---|---|---|---|---|---|---|
| (Intercept) | -0.780 | 0.084 | <0.001 | 5.634 | 0.129 | <0.001 |
| age | 0.008 | 0.001 | <0.001 | 0.009 | 0.002 | <0.001 |
| bmi | 0.006 | 0.002 | 0.009 | | | |
| income | 0.000 | 0.000 | <0.001 | 0.000 | 0.000 | 0.002 |
| ndvisit | 0.113 | 0.008 | <0.001 | 0.070 | 0.007 | <0.001 |
| General Health (ref: Excellent) | | | | | | |
| Poor | 0.761 | 0.111 | <0.001 | 1.090 | 0.160 | <0.001 |
| Fair | 0.538 | 0.063 | <0.001 | 0.598 | 0.112 | <0.001 |
| Good | 0.301 | 0.046 | <0.001 | 0.306 | 0.089 | <0.001 |
| VGood | 0.159 | 0.041 | <0.001 | 0.023 | 0.083 | 0.779 |
| Mental Health (ref: Excellent) | | | | | | |
| Poor | 0.156 | 0.139 | 0.259 | 0.262 | 0.206 | 0.203 |
| Fair | 0.113 | 0.071 | 0.114 | 0.147 | 0.119 | 0.214 |
| Good | -0.053 | 0.044 | 0.223 | 0.057 | 0.081 | 0.482 |
| VGood | -0.034 | 0.039 | 0.385 | 0.087 | 0.075 | 0.243 |
| Education (ref: Less than High School) | | | | | | |
| High School Graduate | 0.082 | 0.037 | 0.027 | 0.142 | 0.078 | 0.069 |
| Some College | 0.241 | 0.040 | <0.001 | 0.251 | 0.081 | 0.002 |
| College Graduate | 0.358 | 0.047 | <0.001 | 0.290 | 0.092 | 0.002 |
| Post-Graduate | 0.397 | 0.058 | <0.001 | 0.409 | 0.107 | <0.001 |
| Ethnicity (ref: White) | | | | | | |
| Black | -0.054 | 0.034 | 0.113 | | | |
| Native American | 0.179 | 0.149 | 0.232 | | | |
| Others | -0.146 | 0.048 | 0.002 | | | |
| Gender (ref: Female) | | | | | | |
| Male | -0.525 | 0.027 | <0.001 | -0.195 | 0.054 | <0.001 |
| Hyperlipidemia | | | | | | |
| Yes | 0.445 | 0.037 | <0.001 | 0.138 | 0.063 | 0.029 |
| Hypertension | | | | | | |
| Yes | 0.360 | 0.036 | <0.001 | 0.097 | 0.063 | 0.128 |
| Region (ref: Northeast) | | | | | | |
| Midwest | 0.034 | 0.045 | 0.455 | -0.017 | 0.084 | 0.838 |
| South | -0.098 | 0.040 | 0.013 | -0.192 | 0.077 | 0.012 |
| West | -0.178 | 0.043 | <0.001 | 0.006 | 0.082 | 0.945 |
| Term | Estimate | Std. Error | p-value |
|---|---|---|---|
| (Intercept) | -0.689 | 0.092 | <0.001 |
| age | 0.010 | 0.001 | <0.001 |
| bmi | 0.008 | 0.002 | <0.001 |
| income | 0.000 | 0.000 | <0.001 |
| ndvisit | 0.101 | 0.004 | <0.001 |
| General Health (ref: Excellent) | |||
| Poor | 1.178 | 0.095 | <0.001 |
| Fair | 0.812 | 0.064 | <0.001 |
| Good | 0.467 | 0.050 | <0.001 |
| VGood | 0.234 | 0.046 | <0.001 |
| Mental Health (ref: Excellent) | |||
| Poor | 0.588 | 0.119 | <0.001 |
| Fair | 0.368 | 0.068 | <0.001 |
| Good | 0.059 | 0.046 | 0.202 |
| VGood | 0.061 | 0.042 | 0.147 |
| Education (ref: Less than High School) | |||
| High School Graduate | 0.095 | 0.041 | 0.022 |
| Some College | 0.286 | 0.044 | <0.001 |
| College Graduate | 0.387 | 0.051 | <0.001 |
| Post-Graduate | 0.453 | 0.060 | <0.001 |
| Ethnicity (ref: White) | |||
| Black | -0.101 | 0.037 | 0.007 |
| Native American | -0.070 | 0.158 | 0.656 |
| Others | -0.126 | 0.053 | 0.018 |
| Gender (ref: Female) | |||
| Male | -0.629 | 0.029 | <0.001 |
| Hyperlipidemia | |||
| Yes | 0.369 | 0.037 | <0.001 |
| Hypertension | |||
| Yes | 0.318 | 0.037 | <0.001 |
| Region (ref: Northeast) | |||
| Midwest | 0.031 | 0.047 | 0.516 |
| South | -0.133 | 0.043 | 0.002 |
| West | -0.193 | 0.046 | <0.001 |
The two-part model captures the dual processes of access and cost. In the first stage (Probit), females, older adults, and those with chronic conditions (hypertension, hyperlipidemia) were significantly more likely to incur expenditure (Table 3.1). In the second stage (Gamma GLM), conditional on incurring any expenditure, significant disparities in cost emerged. A 10-year increase in age is associated with approximately a 9% increase in conditional costs, and males incur approximately 17.7% lower costs than females (p < 0.001). This finding aligns with work by Bertakis et al. (2000), who suggest that women have higher healthcare engagement and diagnostic usage during reproductive years and for preventive screenings.
Self-reported general health showed a strong effect, with individuals who report “Poor” general health incurring approximately 197% higher conditional costs than those in “Excellent” health. The self-reported mental health coefficients, by contrast, were not statistically significant in either part of the model (all p > 0.1, Table 3.1). These patterns are consistent with clinical expectations: individuals who are sicker and/or more health-engaged both enter the system more often and incur higher costs once they do so.
The two-part model achieves a Root Mean Square Error of $1,599 and a Mean Absolute Error of $553. The predictive performance was robust at the aggregate level, with the total predicted cost within 1.8% of the actual total, although the model underestimates extreme outliers (Figure 3.1).
Figure 3.1: Distribution of prediction errors (Actual Cost − Predicted Cost) by expenditure category. The boxplots display the median and interquartile range of errors for patients grouped by their actual healthcare expenditure and the red dashed line represents a perfect prediction. The large positive errors for the highest cost categories show that the model fails to account for the full magnitude of catastrophic health expenditures.
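The accuracy measures quoted above (Root Mean Square Error, Mean Absolute Error, and the percentage gap between predicted and actual totals) follow standard definitions, sketched below. The helper `accuracy_summary` and the toy vectors are hypothetical and are not taken from the MEPS analysis.

```r
# Hypothetical helper: accuracy summaries for observed vs predicted costs
accuracy_summary <- function(actual, predicted) {
  err <- actual - predicted  # same sign convention as Figure 3.1
  c(
    rmse = sqrt(mean(err^2)),
    mae = mean(abs(err)),
    total_pct_diff = 100 * (sum(predicted) - sum(actual)) / sum(actual)
  )
}

# Toy values for illustration only (not MEPS data)
actual    <- c(0, 0, 120, 450, 3000)
predicted <- c(80, 150, 300, 500, 1500)
accuracy_summary(actual, predicted)
```

Because squared errors weight large misses heavily, a single under-predicted high-cost case dominates the RMSE in this toy example while the MAE is less affected, mirroring the gap between the two measures reported for the fitted model.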
As shown in Table 3.2, utilisation patterns mirrored
expenditure. Black and “Other” ethnic groups showed significantly lower visit
counts compared to White patients (Incidence Rate Ratios < 1), consistent with
structural barriers to access and utilisation documented in the US healthcare
system (Waidmann and Rajan 2000; Macias-Konstantopoulos et al. 2023). Non-physician visits
(ndvisit) were positively associated with doctor visits
(approximately 11% increase per
visit), suggesting complementarity rather than substitution between provider
types.
The copula analysis of positive spenders revealed a moderate-to-strong positive dependence between frequency and severity. This implies that patients who have more doctor’s appointments also tend to have higher-than-expected costs, suggesting a compounding resource burden for patients with higher healthcare demands.
As shown in Figure 3.2, there is a clear positive association between the number of visits and total expenditure. While some mechanical correlation is expected (more visits naturally equal more cost), the estimated Gaussian copula parameter (Kendall’s \(\tau =\) 0.522) indicates a dependence that extends beyond simple accumulation. This finding is in agreement with recent work by Marra and Radice (2025b) who used the same MEPS data, and also found that ‘heavy’ users are distinct not only in their visit frequency but also in their resource consumption intensity. This supports the use of a joint frequency-severity modelling framework over independent models.
Figure 3.2: Joint distribution of healthcare utilisation and expenditure for patients with at least one doctor’s visit. The vertical banding reflects the discrete nature of visit counts and the y-axis shows total doctor-visit expenditure (dvexpend) on a logarithmic scale to accommodate skewness. The overlaying positive trend highlights that expenditure increases with visit frequency. The model-estimated Kendall’s \(\tau\) quantifies the residual dependence, confirming that frequency and severity are positively correlated even after adjusting for covariates.
This analysis highlights the complex drivers of healthcare demand. The two-part expenditure model indicates that the decision to seek care and the resulting cost intensity are driven by largely overlapping covariates, but with distinct magnitudes of effect. The significant gender and health-status gaps reinforce the need for risk-adjustment models that explicitly account for biological and systemic usage differences (Bertakis et al. 2000).
The utilisation analysis identified significant ethnic disparities in visit frequency. Even after controlling for insurance proxies (income) and health status, minority groups accessed physicians less frequently. This supports the “Unequal Treatment” hypothesis, where systemic factors influence access independent of clinical need (Smedley et al. 2003).
A key limitation of this study is the reliance on observational cross-sectional data, which precludes causal inference. While ndvisit was used as a proxy for care-seeking propensity, it may introduce endogeneity if unobserved health shocks drive both ndvisit and the outcomes simultaneously. Furthermore, while the Gamma and Negative Binomial models handled skewness and overdispersion well, the diagnostic plots (Appendix A) show some strain in capturing the most extreme right-tail outliers, a common challenge in medical econometrics.
The copula analysis adds a critical dimension: frequency and severity are not independent. The positive Kendall’s τ suggests that high-frequency users are distinct not just in volume, but in the complexity of resources consumed per visit. Future frequency-severity modelling should therefore avoid independence assumptions to prevent the underestimation of aggregate risk for high-cost patient cohorts.
Coefficients for the expenditure and utilisation models are shown in Tables 3.1 and 3.2.
For the extensive margin (any expenditure), the Probit model substantially reduces deviance relative to the intercept-only model (from 14,677 to 12,283), corresponding to a likelihood-ratio χ²(25) = 2,394, p < 0.001. For the intensive margin (positive expenditure), the Gamma GLM with log link reduces deviance from 10,693 to 8,991 (χ²(21) = 1,702, p < 0.001). Thus, in both parts the covariates jointly provide a statistically significant improvement in fit.
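The likelihood-ratio statistic for the Probit part can be reproduced from the two deviances quoted above. The helper `lr_test` is a hypothetical name used only for this sketch; the deviances and degrees of freedom are those reported in the text.

```r
# Likelihood-ratio test from null and residual deviances
lr_test <- function(null_dev, resid_dev, df) {
  stat <- null_dev - resid_dev
  c(
    statistic = stat,
    df = df,
    p_value = pchisq(stat, df = df, lower.tail = FALSE)
  )
}

# Probit part: deviances as reported in the text (statistic = 2394)
probit_lr <- lr_test(14677, 12283, df = 25)
probit_lr
```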
In the Probit model, older age, female gender, higher BMI, worse self-reported general health, the presence of hypertension or hyperlipidemia, higher income, and more non-physician visits (ndvisit) are all associated with a higher probability of incurring any doctor-visit expenditure (all p < 0.01). Residence in the South and West and being in the “Other” ethnic group are associated with a lower probability of any expenditure, suggesting regional and ethnic differences in access or utilisation.
In the Gamma model, coefficients can be interpreted multiplicatively on the mean conditional cost: a coefficient β corresponds to a 100·(e^β − 1)% change in expected expenditure, holding other variables constant. For example, conditional on having any expenditure, males spend around 18% less on doctor visits than females, while individuals reporting fair and poor general health have approximately 82% and 197% higher expected expenditure, respectively, than those reporting excellent health. A 10-year increase in age is associated with roughly a 9% increase in conditional costs, and each additional non-physician visit corresponds to an estimated 7% increase in doctor-visit expenditure.
The distribution of prediction errors is highly asymmetric: Figure 3.1 shows that prediction errors are centred near zero for patients with low or moderate expenditure, but the model tends to under-predict spending in the highest cost categories (≥ $10k), reflecting the difficulty of capturing rare catastrophic costs with a parametric model.
For the count of doctor visits (dvisit), a Negative Binomial regression model
was preferred to a Poisson specification. A formal overdispersion test, based on
comparing the Poisson residual deviance to a χ² distribution with the
corresponding residual degrees of freedom, yielded a Pearson dispersion of
4.18 and p < 0.001. The Poisson
model also had a much higher AIC than the Negative Binomial model
(AIC\(_\text{Poisson}\) = 50,386 versus AIC\(_\text{NegBin}\) =
37,334), confirming that the Poisson variance assumption is
not appropriate.
The final Negative Binomial model reduces deviance from 13,791 to 10,439 (χ²(25) = 3,352, p < 0.001), indicating strong joint effects of the covariates. Interpreting the exponentiated log-link coefficients as incidence rate ratios, a 10-year increase in age is associated with approximately 11% more doctor visits. Consistent with the expenditure model, males have substantially lower utilisation than females (about 47% fewer visits), and worse general health is associated with markedly higher visit frequencies; individuals reporting poor health have an estimated 225% more visits than those in excellent health. Each additional non-physician visit is associated with roughly 11% more doctor visits, illustrating the complementarity between different modes of care.
Ethnic and regional patterns are also evident: Black and “Other” ethnic groups, and residents in the South and West, have significantly fewer visits than otherwise similar White patients in the Northeast, after controlling for health status and socioeconomic factors. This is consistent with documented disparities in access to primary care.
Figure 5.1: Deviance residuals versus fitted values for the Gamma GLM of conditional doctor-visit expenditure. Each point represents a respondent and the dashed line indicates a residual value of zero.
Deviance residuals for the Gamma GLM are broadly centred around zero across the range of fitted expenditures, with increasing spread at higher predicted costs as expected under a Gamma–log specification. A small number of large positive residuals suggest under-prediction for some high-cost individuals, consistent with the model’s tendency to struggle with extreme expenditures.
Figure 5.2: Deviance residuals versus fitted values for the Negative Binomial regression of doctor-visit counts.
Deviance residuals (utilisation) show no strong systematic pattern, with most observations clustered near zero for low fitted visit counts. Variability increases slightly with the fitted mean and a few large positive residuals appear among the highest predicted counts, indicating some under-prediction for the heaviest users but no major violation of the Negative Binomial mean–variance structure.
A key strength of this analysis is the alignment between the data structure and the chosen models. The two-part framework explicitly separates the extensive margin (any use) from the intensive margin (conditional costs), while the Negative Binomial model accommodates overdispersed visit counts. Variable selection was guided by Lasso screening, which helps to identify the most predictive covariates and reduce overfitting, before refitting unpenalised GLMs for interpretation. The models were estimated on a large, nationally representative dataset, and predictive checks suggest good calibration at the population level, with total predicted expenditure closely matching the observed total.
However, several limitations should be noted. First, the models describe associations, not causal effects. The observational nature of MEPS means that unobserved factors—such as underlying disease severity, insurance plan details, or provider characteristics—may confound the relationships between covariates and utilisation or cost. For example, ndvisit is interpreted as a proxy for latent propensity to seek care, but it may also be affected by the same unobserved health shocks that drive doctor visits, introducing simultaneity. Second, the linear predictors assume additive effects on the link scale and do not capture potential non-linearities or interactions (e.g. age by comorbidity); future work could relax this using splines or flexible machine-learning models such as tree-based methods. Third, despite the use of Gamma and Negative Binomial families, the models still struggle to reproduce the extreme right tail of the cost distribution, as evidenced by the large positive errors among the highest-cost patients.
Finally, although residuals and diagnostic plots do not reveal gross mis-specification, the usual GLM assumptions apply at the level of the conditional distribution: independence of observations, correct specification of the mean–variance relationship, and inclusion of the most important predictors. Violations of these assumptions, together with potential omitted variable bias, limit the extent to which conclusions can be generalised or interpreted in a causal sense. Nevertheless, the models provide a coherent descriptive summary of how doctor-visit utilisation and expenditure vary with observable demographic, socioeconomic, and health characteristics.
A Kendall’s \(\tau \approx 0.52\) indicates moderate-to-strong positive dependence between utilisation and expenditure among individuals with positive doctor-visit costs, after adjusting for covariates. In plain terms: even after controlling for demographics, socioeconomic factors, and health status, individuals who have more doctor visits than expected also tend to incur higher doctor-visit expenditures than expected.
A copula-based joint model for positive spenders estimated substantial residual dependence between doctor-visit frequency and expenditure (Gaussian copula \(\hat{\theta} = 0.731\), corresponding to Kendall’s \(\hat{\tau} \approx 0.52\)), suggesting that utilisation and cost intensity are linked beyond shared observed risk factors.
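For a Gaussian copula, the association parameter and Kendall's \(\tau\) are linked by a standard closed-form relationship, which gives a quick consistency check on the two reported values (assuming \(\hat{\theta}\) is the copula correlation parameter):

\[
\hat{\tau} = \frac{2}{\pi}\arcsin\big(\hat{\theta}\big) = \frac{2}{\pi}\arcsin(0.731) \approx \frac{2}{\pi} \times 0.8198 \approx 0.522
\]

This agrees with the Kendall's \(\tau\) of 0.522 reported alongside Figure 3.2.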
An accessible HTML version of this report is available via a public GitHub page:
This report was created in R Markdown. The source code is open source and is available via:
Changes made to this document were tracked using Git and are also available via the same repository.
A bonus interactive dashboard is available via:
The source code for this dashboard is available via the project repository:
## Gamma GLM diagnostics (Part 2)
par(mfrow = c(2, 2))
plot(m_part2) # uses deviance residuals for glm

## Negative Binomial diagnostics
par(mfrow = c(2, 2))
plot(m_count) # glm.nb inherits from glm/lm -> same style of plots